[Distributed] Bump torch version #1225
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchchat/1225
Note: Links to docs will display an error until the docs builds have been completed.
❌ 1 New Failure as of commit 5e3c3ce with merge base 24d00ea: one new job failure was reported.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
We need to bump the nightly vision version to avoid an error, i.e. this:
VISION_NIGHTLY_VERSION=dev20240901
Bumping it to the same date as PYTORCH_NIGHTLY_VERSION fixes the error for me. See inline comment.
Vision and PT being on the same nightly pin usually solves this.
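To make the mismatch concrete, here is a quick sanity check, a sketch rather than project code, that the installed torch and torchvision wheels come from the same nightly date; the `nightly_date` helper is hypothetical:

```python
# Hypothetical sanity check (not torchchat code): verify that torch and
# torchvision were built from the same nightly date, which is the mismatch
# discussed above.
import torch
import torchvision


def nightly_date(version: str) -> str:
    # "2.6.0.dev20240925+cu121" -> "20240925"; returns "" for non-nightly builds.
    return version.split(".dev", 1)[1][:8] if ".dev" in version else ""


t = nightly_date(torch.__version__)
v = nightly_date(torchvision.__version__)
assert t == v, (
    f"nightly mismatch: torch {torch.__version__} vs torchvision {torchvision.__version__}"
)
print(f"torch and torchvision are both on the {t or 'release (non-nightly)'} build")
```

If torchvision is pinned to an older nightly (e.g. dev20240901) than torch, the assertion fires, which is the same class of mismatch that keeping the two pins on one date avoids.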
I'll also be bumping the version over on ExecuTorch in pytorch/executorch#5549. I can match you at 9/25.
@metascroy Wondering if you could have a look at the CI issue? Thanks!
At first glance, I'm not sure why it is failing. I'll investigate more after lunch, but I'm wondering if PyTorch changed how custom ops work in the pin bump.
Cc @zou3519: any recent changes in custom op support that may be relevant to the CI failure here?
Maybe @jerryzh168?
@kwen2501 I figured out the issue. ExecuTorch uses the PyTorch 9/1 nightly, so when ExecuTorch installs, it pulls in 9/1, and the two different PyTorch versions conflict. In #1235 I bump the PT pin but remove the ET parts of the test, and it passes. So before the PT pin can be updated here, it needs to be updated in ET first, and the PT and ET pins then need to be updated together in torchchat. cc @Jack-Khuu
Thanks @metascroy!
@Jack-Khuu Looks like this bump here needs to wait for your bump in ET to land first :)
Unified pin bump: #1269
Distributed inference (the pipeline-parallel part, to be specific) requires two features that landed in the PyTorch nightly:
- [Pipelining] Make PipelineStage support meta initialization
- [Pipelining] Allow non-0 stages to accept kwargs
Thus bumping the torch version to 2.6.0.dev20240925. Tested locally.
Cc: @Jack-Khuu @lessw2020
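For context, here is a minimal sketch (not this PR's code) of how the two nightly features above come into play in a pipeline-parallel setup. `PipelineStage` and `ScheduleGPipe` are the real `torch.distributed.pipelining` APIs; the toy stage module, shapes, backend choice, and the specific `step()` call pattern are assumptions for illustration only:

```python
# Minimal sketch, assuming roughly torch 2.6.0.dev20240925:
# (1) build a stage module under the meta device, and
# (2) pass kwargs to non-0 pipeline stages via schedule.step().
# The ToyStage module and all shapes are hypothetical.
import torch
import torch.distributed as dist
from torch import nn
from torch.distributed.pipelining import PipelineStage, ScheduleGPipe


class ToyStage(nn.Module):
    """Stand-in for one pipeline slice of the real model."""

    def __init__(self, dim: int = 64):
        super().__init__()
        self.proj = nn.Linear(dim, dim)

    def forward(self, x, scale: float = 1.0):
        # `scale` is the kind of keyword argument that non-0 stages can now receive.
        return self.proj(x) * scale


def main():
    dist.init_process_group(backend="gloo")
    rank, world = dist.get_rank(), dist.get_world_size()
    device = torch.device("cpu")

    # Feature 1: meta initialization -- construct the stage module without
    # allocating real weights; a real run would load a checkpoint afterwards.
    with torch.device("meta"):
        module = ToyStage()
    module.to_empty(device=device)  # checkpoint loading would go here

    stage = PipelineStage(module, stage_index=rank, num_stages=world, device=device)
    schedule = ScheduleGPipe(stage, n_microbatches=world)

    # Feature 2: keyword arguments passed to step() are assumed to be forwarded
    # to non-0 stage modules as well (what the second nightly change enables).
    if rank == 0:
        schedule.step(torch.randn(world, 64), scale=0.5)
    else:
        schedule.step(scale=0.5)

    dist.destroy_process_group()


if __name__ == "__main__":
    main()
```

Run with something like `torchrun --nproc-per-node 2 sketch.py`; the exact call pattern in torchchat's distributed path may differ.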